Discovery of Constraints from Data for Information System Reverse Engineering
Abstract
The extraction of functional dependencies is a fundamental activity in the database design recovery process, which is part of an overall information systems reverse engineering effort. Existing algorithms for this task are computationally expensive and appear to be infeasible when applied to large legacy database instances: their performance deteriorates when the number of attributes and/or instances is large, and they cannot tolerate the erroneous data that may occur in deployed commercial systems. The contributions of this paper are as follows. We propose two algorithms for discovering functional dependencies from data. The collective-FD algorithm, which is based on a top-down approach, prevents redundant specialised functional dependencies from being proposed. The attribute-list algorithm, which is based on a bottom-up approach, enables more accurate functional dependency hypotheses to be discovered. In anticipation of noisy data, we propose an effective method to discover possible data errors and compute partial functional dependencies. The result is an error-tolerant functional dependency discovery approach that is more applicable to real-world databases for design recovery.
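As a rough illustration of what an error-tolerant (partial) functional dependency means, the sketch below checks a single candidate dependency against a small in-memory table. It is not the collective-FD or attribute-list algorithm from the paper, whose details the abstract does not give. The names `fd_error` and `holds`, the sample rows, and the choice of error measure (the fraction of rows that would have to be removed for the dependency to hold exactly) are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def fd_error(rows, lhs, rhs):
    """Fraction of rows that must be removed for lhs -> rhs to hold exactly.

    Hypothetical helper for illustration; the error measure is an assumption,
    not the method described in the abstract.
    """
    if not rows:
        return 0.0
    # Group rows by their left-hand-side values and count each
    # right-hand-side value seen within a group.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][tuple(row[a] for a in rhs)] += 1
    # In each group, keep only the rows with the most common rhs value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return 1 - kept / len(rows)


def holds(rows, lhs, rhs, max_error=0.0):
    """True if lhs -> rhs holds once at most max_error of the rows are ignored."""
    return fd_error(rows, lhs, rhs) <= max_error


# Made-up sample data; the last row simulates a data-entry error.
rows = [
    {"emp": 1, "dept": "A", "city": "Oslo"},
    {"emp": 2, "dept": "A", "city": "Oslo"},
    {"emp": 3, "dept": "B", "city": "Rome"},
    {"emp": 4, "dept": "B", "city": "Roma"},
]

print(fd_error(rows, ["dept"], ["city"]))               # 0.25
print(holds(rows, ["dept"], ["city"]))                  # False
print(holds(rows, ["dept"], ["city"], max_error=0.25))  # True
```

With a zero threshold the check is an exact dependency test; raising the threshold lets a dependency survive a small number of erroneous rows, and the rows outside each group's majority value are natural candidates for possible data errors.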
Similar resources
Survey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery
This research explores the manipulation of biomedical big data and disease detection using automated computing mechanisms. Because an efficient and cost-effective way to discover diseases and drugs is important for society, a computer-aided automated system is a must. This paper aims to understand the importance of computer-aided automated systems among the people. The analysis result from collected da...
Informal and Intelligent Acquisition of Semantic Constraints in Database Design and Reverse Engineering
The main objective of database modelling is the design of a database that is correct and can be processed efficiently by a database management system. The efficiency and correctness of a database depends among other things on knowledge about database semantics because semantic constraints are the prerequisite for normalisation and restructuring operations. Acquisition of semantic constraints remai...
Data-mining-based automated reverse engineering and defect discovery
A data mining based procedure for automated reverse engineering and defect discovery has been developed. The data mining algorithm for reverse engineering uses a genetic program (GP) as a data mining function. A GP is an evolutionary algorithm that automatically evolves populations of computer programs or mathematical expressions, eventually selecting one that is optimal in the sense it maximiz...
Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)
Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, the decision-making process is intimately related to some factors which determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated companies, where managers ne...
Reverse Engineering by Visualizing and Querying
The automatic extraction of high-level structural information from code is important for both software maintenance and reuse. Instead of using special-purpose tools, we explore the use of a general-purpose data visualization system called Hy+ for querying and visualizing information about object-oriented software systems. Hy+ supports visualization and visual querying of arbitrary graph-like dat...
Publication date: 1997